Week 7: Simple Regression

PSC 8101 Lab

Setting Up For Simple Regression

  • Install “jtools” if you haven’t already (line 3)
3   install.packages("jtools")

  • Load all packages (lines 5-9)
library(haven)
library(tidyverse)
library(psych)
library(jtools)
library(skimr)

  • Load “states” data
states <- read_dta("states.dta")

  • We can work with a reduced dataset if we want:
statesub <- states |> select(state, conpct_m, womleg_2017)

Visualize Relationship - Smoothed Line

states |>
  ggplot(aes(x=conpct_m, y=womleg_2017)) + 
  geom_point() +
  geom_smooth()

Visualize - Line of Best Fit

states |>
  ggplot(aes(x=conpct_m, y=womleg_2017)) + 
  geom_point() +
  geom_smooth(method="lm")

Remove Error Band

states |>
  ggplot(aes(x=conpct_m, y=womleg_2017)) + 
  geom_point() +
  geom_smooth(method="lm", se=FALSE)

Correlation

cor(states$womleg_2017, states$conpct_m)
[1] -0.5848877


# tidyverse command
summarize(states, cor(womleg_2017, conpct_m))
# A tibble: 1 × 1
  `cor(womleg_2017, conpct_m)`
                         <dbl>
1                       -0.585

Descriptive Statistics, skimr Package

skim(states, womleg_2017, conpct_m)

Data summary
  Name                     states
  Number of rows           50
  Number of columns        201
  Column type frequency:
    numeric                2
  Group variables          None

Variable type: numeric

skim_variable  n_missing  complete_rate   mean    sd     p0    p25    p50    p75   p100  hist
womleg_2017            0              1  25.03  7.64  11.10  19.47  24.80  30.47  40.00  ▆▇▇▇▃
conpct_m               0              1  33.97  5.34  22.38  30.25  33.78  38.26  46.83  ▂▇▇▇▁

Boxplot

ggplot(states, aes(y=womleg_2017)) + 
  geom_boxplot() +
  theme_minimal()

Estimate Simple Regression Model

lm(womleg_2017 ~ conpct_m, data = states)

Call:
lm(formula = womleg_2017 ~ conpct_m, data = states)

Coefficients:
(Intercept)     conpct_m  
    53.4650      -0.8369  

Base R summary: Save Model as Object and Summarize

model1 <- lm(womleg_2017 ~ conpct_m, data = states)
summary(model1)

Call:
lm(formula = womleg_2017 ~ conpct_m, data = states)

Residuals:
     Min       1Q   Median       3Q      Max 
-13.7444  -3.9821  -0.8816   4.4722  14.4329 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  53.4650     5.7599   9.282 2.74e-12 ***
conpct_m     -0.8369     0.1675  -4.996 8.18e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.265 on 48 degrees of freedom
Multiple R-squared:  0.3421,    Adjusted R-squared:  0.3284 
F-statistic: 24.96 on 1 and 48 DF,  p-value: 8.176e-06
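In simple regression, several numbers in this output are tied to the correlation and descriptive statistics shown earlier. The sketch below checks three of those links using values copied from the slides (the standard deviations are the rounded skim values, so the slope agrees only approximately). The variable names are illustrative and not part of the lab script.

```r
# Values copied from the output on earlier slides
r     <- -0.5848877   # cor(womleg_2017, conpct_m)
sd_y  <- 7.64         # sd of womleg_2017 (skim output, rounded)
sd_x  <- 5.34         # sd of conpct_m (skim output, rounded)
t_val <- -4.996       # t value for conpct_m

r_squared <- r^2              # compare with Multiple R-squared (0.3421)
slope     <- r * sd_y / sd_x  # compare with the conpct_m estimate (-0.8369)
f_stat    <- t_val^2          # compare with the F-statistic (24.96)

round(c(r_squared = r_squared, slope = slope, f_stat = f_stat), 4)
```

These identities hold only with a single predictor; with more than one X, R-squared is no longer the square of a single pairwise correlation.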

Or

summary(lm(womleg_2017 ~ conpct_m, data = states))

Call:
lm(formula = womleg_2017 ~ conpct_m, data = states)

Residuals:
     Min       1Q   Median       3Q      Max 
-13.7444  -3.9821  -0.8816   4.4722  14.4329 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  53.4650     5.7599   9.282 2.74e-12 ***
conpct_m     -0.8369     0.1675  -4.996 8.18e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.265 on 48 degrees of freedom
Multiple R-squared:  0.3421,    Adjusted R-squared:  0.3284 
F-statistic: 24.96 on 1 and 48 DF,  p-value: 8.176e-06

Report Results, jtools

summ(model1)

Observations         50
Dependent variable   womleg_2017
Type                 OLS linear regression

F(1,48)    24.96
R²          0.34
Adj. R²     0.33

               Est.   S.E.   t val.      p
(Intercept)   53.47   5.76     9.28   0.00
conpct_m      -0.84   0.17    -5.00   0.00

Standard errors: OLS

Or, jtools

summ(lm(womleg_2017 ~ conpct_m, data = states))

Observations         50
Dependent variable   womleg_2017
Type                 OLS linear regression

F(1,48)    24.96
R²          0.34
Adj. R²     0.33

               Est.   S.E.   t val.      p
(Intercept)   53.47   5.76     9.28   0.00
conpct_m      -0.84   0.17    -5.00   0.00

Standard errors: OLS
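The p column here is rounded to two decimals, so 0.00 means "very small," not exactly zero. As a rough check, the p-value and a 95% confidence interval for the slope can be rebuilt from the estimate, standard error, and degrees of freedom reported in the base R summary. The variable names below are illustrative, and the inputs are the rounded values from the slides.

```r
# Values copied from the regression output for conpct_m
estimate <- -0.8369
std_err  <- 0.1675
df_resid <- 48

# Two-sided p-value from the t distribution
t_val <- estimate / std_err
p_val <- 2 * pt(-abs(t_val), df = df_resid)

# 95% confidence interval: estimate plus/minus critical t times standard error
t_crit <- qt(0.975, df = df_resid)
ci_95  <- estimate + c(-1, 1) * t_crit * std_err

signif(p_val, 3)   # on the order of 8e-06, in line with summary(model1)
round(ci_95, 2)    # roughly -1.17 to -0.50
```

With the fitted object available, confint(model1) should give the same interval without the hand arithmetic. jtools::summ() also takes digits and confint arguments if you want these shown in the table.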

Interpretation of a regression coefficient

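  • Rise over run: the slope b is the change in Y for a 1-unit change in X.
  • Positive slope (+b): For a 1-unit increase in X, Y increases by b units, on average.
  • Negative slope (-b): For a 1-unit increase in X, Y decreases by b units, on average.
  • Our example: What are our units of measurement? What is 1 unit?
  • Both variables are percentages, so 1 unit is 1 percentage point. For a 1-percentage-point increase in a state’s conservatism, the percentage of women in the state legislature decreases by 0.84 percentage points, on average.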

Interpretation of a regression coefficient

  • For a 1-percentage-point increase in a state’s conservatism, the percentage of women in the state legislature decreases by 0.84 percentage points, on average.

  • For a 10-percentage-point increase in a state’s conservatism, …

  • For a 10-percentage-point increase in a state’s conservatism, the percentage of women in the state legislature decreases by 8.4 percentage points, on average (10 × 0.84).
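The 10-unit case can be checked directly from the fitted line. The sketch below plugs two conservatism values into the estimated equation using the coefficients from the model output. The helper predict_womleg() is hypothetical and not part of the lab script; 30 and 40 were picked because they fall inside the observed range of conpct_m (22.38 to 46.83).

```r
# Coefficients copied from the model output
intercept <- 53.4650
slope     <- -0.8369

# Hypothetical helper: predicted womleg_2017 for a given conpct_m
predict_womleg <- function(conpct) intercept + slope * conpct

predict_womleg(30)                        # predicted value at conpct_m = 30
predict_womleg(40)                        # predicted value at conpct_m = 40
predict_womleg(40) - predict_womleg(30)   # equals 10 * slope
```

With the model object in memory, predict(model1, newdata = data.frame(conpct_m = c(30, 40))) should return the same predictions, up to rounding of the coefficients.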